DomEx: Extraction of Sentiment Lexicons for Domains and Meta-Domains
Authors
Abstract
In this paper we describe DomEx, a sentiment lexicon extractor that implements a new approach to domain-specific sentiment lexicon extraction. Extraction is based on a machine learning model that uses a set of statistical and linguistic features. The model is trained on the movie domain and can then be applied to other domains. The system can work with various domains and languages once the text has been part-of-speech tagged. Finally, the system makes it possible to combine the sentiment lexicons of similar domains into one general lexicon for the corresponding meta-domain.
Similar resources
Extraction of Russian Sentiment Lexicon for Product Meta-Domain
In this paper we consider a new approach for domain-specific sentiment lexicon extraction in Russian. We propose a set of statistical features and a combination of algorithms that can discriminate sentiment words in a specific domain. The extraction model is trained on the movie domain and then applied to other domains. We evaluate the quality of the obtained sentiment vocabularies intrinsically. Finall...
Extraction of Domain-specific Opinion Words for Similar Domains
In this paper we consider a new approach for domain-specific opinion word extraction in Russian. We suppose that some domains have similar sentiment lexicons and use this fact to build an opinion word vocabulary for a group of domains. We train our model on the movie domain and then apply it to the book and game domains. The quality of the obtained word lists is comparable to that of the list for the initial domain.
MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs
In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in Web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of the main tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...
Informal Multilingual Multi-domain Sentiment Analysis
This paper addresses the problem of sentiment analysis in an informal setting in multiple domains and in two languages. We explore the influence of using background knowledge in the form of different sentiment lexicons, as well as the influence of various lexical surface features. We evaluate several different feature set combination strategies. We show that the improvement resulting from using...
Уточнение русскоязычных словарей эмоциональной лексики с использованием тезауруса RuThes (Refinement of Russian Sentiment Lexicons Using RuThes Thesaurus)
The paper describes a combined approach to the extraction of a domain-specific sentiment lexicon. First, an initial version of a domain-specific lexicon is obtained by applying a supervised model. In the second stage, the ordered list of sentiment words is refined using thesaurus information. This combined model is applied to several domains and at last the domain-specific sentiment lex...